KMID : 1022420150070040009
Phonetics and Speech Sciences
2015 Volume.7 No. 4 p.9 ~ p.16
Performance Comparison of Deep Feature Based Speaker Verification Systems
Kim Dae-Hyun

Seong Woo-Kyeong
Kim Hong-Kook
Abstract
This paper compares the performance of speaker verification (SV) systems that use deep neural network (DNN) based features. First, three types of DNN input features, mel-frequency cepstral coefficients (MFCCs), linear-frequency cepstral coefficients (LFCCs), and perceptual linear prediction (PLP) coefficients, are compared in terms of SV performance. Next, the effects of the DNN training method and the hidden-layer structure on SV performance are investigated for each feature type. Each SV system is then evaluated using either I-vector or probabilistic linear discriminant analysis (PLDA) scoring. The SV experiments show that a tandem feature combining the DNN bottleneck feature with the MFCC feature gives the best performance when the DNN is configured with a rectangular hidden-layer structure and trained with a supervised method.
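As a rough illustration of the tandem feature mentioned above, the sketch below joins a bottleneck vector and an MFCC vector frame by frame. The abstract does not say how the two streams are combined or what their dimensions are; frame-wise concatenation is assumed here as the conventional reading of "tandem", and the function name `make_tandem_features` and all numeric values are hypothetical.

```python
from typing import List, Sequence

Frame = Sequence[float]


def make_tandem_features(
    bottleneck: Sequence[Frame], mfcc: Sequence[Frame]
) -> List[List[float]]:
    """Concatenate per-frame bottleneck and MFCC vectors.

    Assumption: both streams are already frame-aligned, so frame i of
    one stream corresponds to frame i of the other.
    """
    if len(bottleneck) != len(mfcc):
        raise ValueError("both streams must have the same number of frames")
    # Each tandem frame is the bottleneck vector followed by the MFCC vector.
    return [list(b) + list(m) for b, m in zip(bottleneck, mfcc)]


# Toy values only: 2 frames, 3-dim bottleneck outputs and 2-dim MFCCs.
bottleneck_frames = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
mfcc_frames = [[1.0, 2.0], [3.0, 4.0]]
tandem = make_tandem_features(bottleneck_frames, mfcc_frames)
```

Under this assumption, each tandem frame has a dimension equal to the sum of the two input dimensions (5 in the toy data), and it would then be passed to the I-vector or PLDA back end in place of the plain MFCC frame.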
KEYWORD
speaker verification, deep neural network, tandem feature
Listed journal information
Korea Research Foundation (KCI)